A SELF-LEARNING ALGEBRAIC MULTIGRID METHOD FOR 
EXTREMAL SINGULAR TRIPLETS AND EIGENPAIRS 
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Abstract. A self-learning algebraic multigrid method for dominant and minimal singular triplets 
and eigenpairs is described. The method consists of two multilevel phases. In the first, multiplicative 
phase (setup phase), tentative singular triplets are calculated along with a multigrid hierarchy of 
interpolation operators that approximately fit the tentative singular vectors in a collective and self- 
learning manner, using multiplicative update formulas. In the second, additive phase (solve phase), 
the tentative singular triplets are improved up to the desired accuracy by using an additive correction 
scheme with fixed interpolation operators, combined with a Ritz update. A suitable generalization of 
the singular value decomposition is formulated that applies to the coarse levels of the multilevel cycles. 
The proposed algorithm combines and extends two existing multigrid approaches for symmetric 
positive definite eigenvalue problems to the case of dominant and minimal singular triplets. Numerical 
tests on model problems from different areas show that the algorithm converges to high accuracy in 
a modest number of iterations, and is flexible enough to deal with a variety of problems due to its 
self-learning properties. 
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1. Introduction. In this paper we present an algebraic multigrid (AMG) method 
for accurately computing a few of the largest or smallest singular values and associ- 
ated singular vectors of a sparse rectangular matrix A £ M mxn . Let the singular 
value decomposition (SVD) of A be given by 

A = UY,V l . (1.1) 

Here, U £ ]R mxm and V £ R nXn , with U l U = I m and V t V = I n , where I m and 
I n are the unit matrices of sizes m x m and n x n, respectively. Matrix X £ JR mxn 
has the / — min(m, n) singular values o\ > a 2 > . . . > 07 > of A on its diagonal. 
In what follows we will normally assume that m > n, except where noted otherwise. 
The columns Uj of U are called the left singular vectors of A, and the columns Vj of V 
are its right singular vectors. The n singular triplets (<jj, Uj,Vj), j = 1, . . . , n, satisfy 

A Vj = a 3 u h 
A l u 3 ^a,v r (1.2) 

For the special case that A is square and symmetric positive definite (SPD), the SVD 
of A coincides with the eigendecomposition of A, and a suitably simplified version 
of the AMG method we propose in this paper will be applicable to the problem of 
computing a few of the largest or smallest eigenvalues and associated eigenvectors of 
an SPD matrix A. 

For definitencss. we will frame the presentation in most of the paper in terms of 
calculating a few of the singular triplets with largest singular values (which we call 
dominant triplets), and we will comment on the case of the singular triplets with the 
smallest singular values (which we call minimal triplets) at the end of the algorithm 
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presentation. So we assume we seek the nj, dominant singular triplets (crj,Uj,Vj), 
j = 1, . . . , n&, of A, with singular values a\ > cr 2 > • • ■ > ^Vii,- 

There are many applications in scientific computing where dominant or mini- 
mal singular triplets of large sparse matrices need to be computed, see, for example, 
the discussion and references in [26] [2]. We mention a few examples. Latent se- 
mantic indexing determines concepts in documents by calculating dominant singular 
triplets of term-document matrices |16j . Similarly, principal component analysis is 
used in exploratory data analysis to identify orthogonal components with maximal 
variance, which correspond to dominant singular triplets of the data matrix [24] . In 
[13] . a smoothed aggregation method is described for nonsymmetric linear systems 
that arise from partial differential equation (PDE) discretization, and which requires 
approximate calculation of the minimal singular triplet of the problem matrix in a 
setup phase of the solver. Similarly, calculating dominant or minimal cigenpairs of 
SPD matrices also has many applications, see, e.g., [231 FH l2"2l |2"TI 15] . 

The computation of a few extremal singular triplets of large sparse matrices has 
been the focus of many research efforts, see, for example, |26[ [2] and the numerous 
references therein. In recent times, Lanczos bidiagonalization methods and subspace 
iteration methods have received significant attention. Singular triplets can also be 
computed by applying symmetric eigenvalue solvers to A* A or the augmented oper- 
ator 



X = 



A 
A 1 



(1.3) 



but the first approach can lead to poor accuracy of the computed singular values when 
A is ill-conditioned. For the second approach the amount of storage and work required 
can be prohibitive, the number of iterations required to compute a given number of 
singular values increases, and the indefinitcness of operator X has to be dealt with 
[2"5] . For these reasons, methods are being pursued that avoid working on operators 
A 1 A and X [28) [26] [2], and we do the same in this paper. It appears that multilevel 
methods have not been explored yet for the calculation of singular triplets working 
directly on Eqs. (|1.2j) . This is, perhaps, not surprising, since AMG methods for the 
SPD cigcnproblcm are also still quite a young area [7] [4] [22] [27] . It can be expected 
that AMG methods for extremal singular triplets will be competitive for problems 
in which the extremal singular values are highly clustered and the extremal singular 
vectors are similar to each other such that they can be represented well collectively by 
an interpolation operator that interpolates coarse-grid representations of the singular 
vectors to the fine grid. Nonsymmetric discretized elliptic PDE operators are expected 
to have this kind of spectral decomposition. We will investigate such a problem in 
the numerical results section of our paper, but we think that it is also interesting to 
investigate the applicability and performance of our algorithm for other, more general 
SVD problems, and we do so in the numerical results section as well. Numerical 
results will also be presented for SPD cigcnproblems, since our algorithm offers a new 
extension of previous approaches for this type of problems as well. 

Algebraic multigrid was originally developed for solving sparse systems of linear 
equations (see [6] and references in [34] and [15]). Over the years, its applicability 
has been extended in several ways, including to SPD eigenvalue problems [8] [7] [4] 
[22] [27] . The AMG method we propose belongs to the class of self-learning AMG 
methods (we borrow this term from |30j). In these methods, a multigrid hierarchy is 
built with interpolation operators that are determined adaptively and iteratively over 
several multilevel cycles, to match approximately the vectors that arc of interest in 
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the problem at hand. For linear system solvers, these are the vectors that lie close 
to the null-space of the matrix, and for eigenvalue problems, they are the desired 
eigenvectors. In our new method for singular triplets, they will be the desired singular 
vectors. Self-learning AMG solvers are an active area of research and have been 
developed for solving linear equation systems, SPD eigcnproblems, and Markov chain 
problems, see, for example, JS1 [HI [III [121 El HH1 HH1 IH1 1^3 [HHl HH1 1113 - 0ur AMG 
method is also collective, in that it strives to represent several singular vectors by a 
single interpolation matrix for efficiency. 

The AMG method we propose for computing dominant singular triplets consists 
of two multilevel phases. It combines and extends two existing AMG approaches for 
the SPD cigenproblem, that were proposed by Borzi and Borzi in [4] and by Kushnir, 
Galun and Brandt in |27j . In the first, multiplicative phase (setup phase), we calculate 
tentative singular triplets and a multigrid hierarchy with interpolation operators that 
approximately fit the tentative singular vectors in a collective and self-learning man- 
ner. This phase uses power method relaxation and multiplicative coarse-grid update 
formulas for the tentative singular vectors. We use the bootstrap framework [8] in this 
phase with least-squares fitting and random initial singular vectors, in a way similar to 
the approach described by Kushnir, Galun and Brandt in |27] for calculating minimal 
eigenpairs of an SPD matrix. In other related work, the setup phase of the algorithm 
described in [T3] calculates an approximation of the singular vectors that correspond 
to the smallest singular value of a square nonsymmctric matrix, in a way that is less 
general than but similar to our multiplicative phase. In [27], great care is taken to 
try to make the interpolation operators highly accurate for all eigenvectors, in the 
spirit of the exact interpolation scheme (EIS) [7] , leading to an eigenvalue solver that 
only employs this first, multiplicative phase, with accuracy limited to the accuracy 
by which the single interpolation operator represents each eigenvector. In our ap- 
proach, however, we use generic interpolation that fits the tentative singular vectors 
only approximately, and we employ a second, additive phase (solve phase), in which 
the tentative singular triplets are improved up to the desired accuracy by using an 
additive correction scheme with fixed interpolation operators, combined with a Ritz 
update. Our additive phase is similar to the approach described by Borzi and Borzi in 
[4] for calculating minimal eigenpairs of an SPD matrix (which itself is an extension of 
[5]), but in [4] standard AMG interpolation is used, and there is no initial multiplica- 
tive self-learning phase. Our hybrid multiplicative-additive approach results in a new 
AMG method for extremal singular triplets that combines two desirable properties: 
it allows for high-accuracy convergence when desired, and it is flexible enough to deal 
efficiently with a variety of problems due to its self-learning properties. The special- 
ization of our algorithm to the SPD eigenpair case also leads to a new extension of 
the AMG eigenvalue algorithms of [4] and [27] that has the same desirable properties. 

The remainder of this paper is structured as follows. In the next section we 
give a brief description of multiplicative and additive two-level schemes for solving 
(A— XI) x = 0, with A a square SPD matrix and A an assumed given, fixed eigenvalue. 
This will serve to elucidate under which circumstances multiplicative and additive 
update formulations can be equivalent for calculating eigenvectors, and to illustrate 
when it may be beneficial for accuracy and computational cost reasons to append a 
phase with additive cycles to an initial multiplicative, self-learning phase. This will 
set the stage for the description of the multiplicative (setup) phase of our singular 
triplet algorithm in Section [3] This section also introduces a suitable generalization 
of the SVD for formulating the coarse-level problems. Section @] then describes the 
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additive (solve) phase of the algorithm. Section [5] describes how it can be extended 
and specialized to the case of square matrices, minimal singular triplets and extremal 
eigenpairs of SPD matrices. Section [5] contains extensive numerical evaluation of our 
algorithm, and Section [7] concludes. 

2. Two-level Methods for (A — X I) x = 0. In this section, we consider mul- 
tiplicative and additive two-level methods for calculating an eigenvector of a square 
SPD matrix A g M mxm , assuming, for now, that the eigenvalue A is known. This 
academic discussion serves to highlight the principles behind the multiplicative and 
additive approaches, and how they are related and can be combined for calculating 
eigenvectors. The insights gained will motivate the approach of our multilevel algo- 
rithm for calculating dominant singular triplets. Assuming eigenvalue A is known, we 
seek a nontrivial solution x to equation 

(A-XI)x = 0, (2.1) 

with / generically denoting the unit matrix. For definiteness, we can simply assume 
that dim(ker(A — XI)) = 1, and that we seek a solution with ||x||2 = 1- We will 
consider two-level iterative schemes with relaxations on the fine level (or fine grid) 
combined with coarse-grid corrections that are obtained via solving a smaller problem 
on a coarse grid. 

2.1. Multiplicative Correction Scheme. Let a;W be the current fine-grid 
approximation and e.\J ult be its multiplicative error, such that 

z = diag(x«)e« H , (2.2) 

where x is the exact solution of the problem, and diag(a;^^) is a diagonal matrix with 
scW on its diagonal. Note that, at convergence, when xW = x, the multiplicative 

(i) 

error satisfies e^ ult — 1, with 1 the vector of all ones. The problem at hand can be 
rewritten as 

(^-A/)diag(x«)e« utt = 0, (2.3) 

in which we seek the unknown multiplicative error , t . We consider a coarse grid 
with m c unknowns (which may be a subset of the fine-grid unknowns), and, instead of 
the fine-level multiplicative error, e^ ult , we seek to compute a coarse-grid multiplica- 
tive error e mu it,c, which, when interpolated up to the fine grid, would approximately 
equal the unknown fine-level multiplicative error. This may be an inexpensive way 
to improve the fine-level error, since e mu it. c can be computed inexpensively on the 
coarse grid. So we seek e mu ;t lC such that 

Q e mu it,c ~ e£ ult , (2.4) 

with Q e jfjmxmc a coarse-to-fine interpolation matrix for the error, which we require 
to satisfy Q l c = 1 (with l c the coarse- level vector of all ones). Combining Eqs. (|2.3I) 
and (|2.4p and with the help of a restriction operator, R G ]R maXm , we arrive at the 
following m c x m c system of equations for e mu it,c'- 



R(A-XI) diag(x (i) )Q e muH , c = 0. 



(2.5) 
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Eqs. (|2.2p and (|2.4[) then lead to the multiplicative coarse-grid correction formula for 
the fine-grid approximation: 

x^ +1 *> =diag(x«)Qe TBaitl c. (2-6) 
It is also useful to define the interpolation matrix P £ ^my.m a ^ gj ven by 

P = dmg(x^)Q, (2.7) 

which has the property that the current fine-grid approximation, x^\ lies exactly in 
its range, namely, 

x w =Pl c . (2.8) 

More generally, we have that there exists a coarse-level vector e^ ult (and we know 
which one) such that 

* W =P* l lit,c- ( 2 - 9 ) 
Using interpolation operator P, the coarse-grid equation becomes 

R{A-XI)Pe muUjC = Q, (2.10) 

and the multiplicative coarse-grid correction formula 

x^=Pe mult>c . (2.11) 

Considering a two- level method with coarse- grid correction according to Eq. (|2.11[) , 
we note that, for the exact solution a; to be a fixed point of such a two-level method, 
x indeed needs to lie exactly in the range of P at convergence, and the multiplicative 
correction scheme described above assures this by having x^ lie exactly in the range of 
P in each iteration. It is important to realize that this is required for the multiplicative 
scheme to converge to the exact solution. Since P (and possibly R) change in every 
iteration to adapt to the solution sought, we call this multiplicative two- level scheme 
self-learning. 

Note also, that one can always consider a rescaled coarse-level unknown quantity, 
say x c , using a diagonal scaling matrix W, 

e mu u, c = Wx c , (2.12) 

and formulate the multiplicative scheme in terms of x c . For example, defining P = 
PW, one solves coarse- level problem RAP ' x c = and corrects with multiplicative 
coarse-grid correction formula x^ +1 ^ = P x c . Such an x c is generally not a multi- 
plicative error anymore (since it does not hold that x c = l c at convergence), but can 
be some kind of coarse-level representation of the fine-level exact solution x. The 
current iterate, x^ 1 ' , still lies exactly in the range of P in each iteration. This view- 
point is adopted in the derivation of the multiplicative correction scheme as an Exact 
Interpolation Scheme [7J, while our derivation is the more common viewpoint in the 
context of Markov chains [531 HH1 HH] ■ We take this viewpoint here to highlight the 
similarity between the multiplicative and additive error correction formalisms, as will 
be discussed now. 
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2.2. Additive Correction Scheme. Multigrid for linear systems of equations 
is normally formulated in an additive-correction framework |15j . Define residual 
of current approximation x^> as 

rW = -{A- XI) x^. (2.13) 
The additive error, e a l dd , satisfies 

x = e { % d + x (i) , (2-14) 
and the problem at hand can be rewritten as error equation 

(A-AI)e« 1 = rW. (2.15) 

We seek to compute a coarse-grid additive error, e ad d,c, which, when interpolated up 
to the fine grid, would approximately equal the unknown fine-level additive error. So 
we seek e a dd,c such that 

Pe add>c ^e a % (2.16) 

for some coarse-to-fine interpolation operator P 6 ]ftmxmc Combining Eqs. (|2.15l) 
and (|2.16[) and with the help of a restriction operator, R G ]R m " Xm ) we arrive at the 
following m c x m c system of equations for e ad d.c- 

R(A-\I)Pe aM , c = Rr®. (2.17) 

Eqs. (|2.14j) and (|2.16[) then lead to the additive coarse-grid correction formula for the 
fine-grid approximation: 

ar (4+1) =x {l) +P eadd.c. (2.18) 

Fast convergence of the two-level process requires that additive error components that 
are not significantly reduced by fine-level relaxation lie approximately in the range of P 
(and can thus be removed by coarse-grid correction). First consider P = diag(x^) Q 
as above, see Eq. (|2.7[) . This means that x^ % > lies exactly in the range of P, and, close 
to convergence, x will lie approximately in the range of P as well. This means that 
e ald = x ~ x ^ a l so nes approximately in the range of P, so the 'self-learning' P from 
Eq. (|2.7|) is expected to give a suitable interpolation operator also for the additive 
correction scheme. 

In fact, if the same P is used as in the multiplicative method (with x^' lying 
exactly in the range of P, Eq. (|2.9p ) in every iteration of the additive scheme, and 
if R is also taken the same as in the multiplicative scheme, then the additive and 
multiplicative schemes are exactly equivalent. This can easily be seen as follows. 
First, using Eq. (|2.13[) and Eq. (|2.8p . additive coarse-level equation Eq. (|2.17[) can be 
rewritten as 

R(A-\I)Pe addiC = Rr® = -R(A- \I)x® = —R (A — XI) P l c , (2.19) 
R(A-XI)P(e add , c + l c ) = 0, (2.20) 

and by identifying 

&add,c H~ lc &mult,c-, (2.21) 
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one obtains the multiplicative coarse-grid equation, 

RAPe m uit,c = 0. (2.22) 

Eq. (|2.21[) has the nice interpretation that at convergence, on the coarse grid, the ad- 
ditive error, e a dd,c, vanishes, and the multiplicative error, e m uit,a equals l c . Similarly, 
using Eq. (|2.9[) . the multiplicative coarse-grid correction formula can be obtained from 
the additive coarse-grid correction formula: 

x (i+i) = x ® + P Gaddc = p (i c + e addtC ), 

% ^ — P &mult,C' (2.23) 

This shows that the additive scheme with R and P chosen (in every iteration) as in 
the multiplicative scheme (with current iterate x^' exactly in the range of P), is fully 
equivalent with the multiplicative scheme (in exact arithmetic). 

However, unlike the multiplicative scheme, the additive scheme can still converge 
if a;W (and thus ej^L) lies only approximately in the range of P. Therefore, one ap- 
proach to obtaining a convergent additive method is to first (adaptively) determine 
P (and R) in a few multiplicative cycles, and then freeze R and P for subsequent ad- 
ditive cycles. Additive cycles with frozen R and P are much cheaper computationally 
than cycles in which P (and possibly R) are modified in each iteration, often with- 
out sacrificing convergence speed too much, and the resulting hybrid method may 
be significantly cheaper than a fully adaptive method, since all coarse-level opera- 
tors are kept constant in the additive phase. This is one reason to consider hybrid 
multiplicative-additive methods for eigenvalues (see also [3J |3E1 [20] for application 
of this approach in the Markov chain context). The original multigrid method for 
solving linear equation systems is formulated in the additive framework, with fixed 
R and P that are determined using a-priori knowledge of the problem. Self-learning 
methods for linear systems of equations first perform some multiplicative, self-learning 
setup cycles to determine suitable interpolation operators, before proceeding with ad- 
ditive cycles with fixed interpolation [SJ HU E3 13 1201 HI EI] • In the context of 
self-learning solvers for linear equation systems, the first, multiplicative self-learning 
phase is often called the setup phase, because it is merely used for setting up the 
solver, and no approximation to the solution of the linear system Ax = b is sought 
in the multiplicative phase. When computing eigenpairs, however, the multiplicative 
phase is not merely a setup phase, since it also iterates on approximations for the 
eigenpairs, and it can be used as an eigensolver by itself, as in [221 13 EEH HH I2"T] . 
For this reason, we more generally refer to it as a multiplicative phase, in the present 
context of eigenvalue and singular value problems. 

In this paper wc consider methods to compute a few dominant singular triplets or 
eigenpairs that arc hybrid multiplicative-additive, not only for performance reasons, 
but mainly for the following reason: we will seek to formulate multilevel methods in 
which, for efficiency, the same interpolation matrix P can be used to approximate 
several singular vectors or eigenvectors at the same time. This interpolation matrix 
will not contain all of these singular vectors or eigenvectors in its range exactly, so 
a multiplicative scheme will only converge up to the accuracy by which the vectors 
are collectively represented by the interpolation matrix. A multiplicative scheme will 
be used to initiate the calculations and approximately identify the dominant singular 
vectors or eigenvectors, determining suitable interpolation operators in the process. 
Rather than attempting to construct interpolation operators that are very accurate 
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for all vectors sought (as is done in [37] for eigenvalue calculation), we will switch to 
an additive scheme in our singular triplet method, mainly because it can converge 
with high accuracy for all vectors sought, and with the added benefit that it will be 
inexpensive per cycle. 

3. AMG SVD Algorithm: Multiplicative Phase. We now go back to the 
general setting of our paper in which we want to compute dominant singular triplets 
(a, u, v) of rectangular matrix A 6 JR mxn , satisfying 

A t u = crv. (3.1) 

In this section, we formulate the multiplicative phase of the algorithm. (Note that we 
will redefine the interpolation and restriction matrices P, Q, R, etc.) 

3.1. Coarse-level Equations. Consider interpolation matrices P for u and Q 
for v, with P e jR mxm e an d Q g JR nXn °, and P and Q of full rank. First assume 
that u lies exactly in the range of P, and v in the range of Q, so 

u = Pu c , 

v = Qv c , (3.2) 
for some coarse- level vectors u c and v c . We define coarse- level equations 

P* AQv c = crP* BPu Cl 

Q l A 1 Pu c = aQ l CQv c , (3.3) 

and coarse-level operators 

A c = P l AQ, 

B c = P t BP 1 (3.4) 

with, for the finest- level operators, B = I m and C = I n . The coarse- level version of 
fine-level equations (|3.1[) is then given by 

A c v c = a B c u c , 

A\u c = aC c v c . (3.5) 

The intuition behind this approach is as follows: the coarse-level equations can be 
expected to be useful for finding triplet (cr, u, v), since, if (a, u, v) is a singular triplet 
of A and Eqs. (|3.2p are assumed, then (cr, u c ,v c ) is a singular triplet of A c . So one 
can see that, if P and Q can be constructed such that u and v lie exactly in their 
respective ranges (Eqs. p.2[l ), then a coarse-level solve can give us (cr, u, v) exactly. 
The same reasoning applies when coarsening is repeated recursively. Note that the 
B c and C c on all recursive levels are symmetric positive definite (SPD) since the P 
and Q are chosen of full rank. We will now consider methods to build P and Q such 
that u and v lie in their respective ranges approximately. 
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3.2. Generalization of Singular Value Problem. Coarse-level equations 
(|3.5[) are of the form 

A v = a B u, 

A 1 u^gCv, (3.6) 

with B and C SPD. The coarse-level equations motivate the following generalization 
of the singular value decomposition. 

Definition 3.1 (Generalized singular value decomposition). The generalized 
singular value decomposition of A £ M mxn with respect to B £ jftrnxm an( ^ q g 
M nxn , with B and C SPD, is given by 

A = BU YjV 1 C, (3.7) 

with U £ M mXm , V £ M nxn and £ £ M mxn . The columns of U are called the 
left generalized singular vectors, and the columns of V are called the right generalized 
singular vectors. They satisfy the orthogonality relations £/* B U — I m = U BU l and 
V C 'V = I n = V CV . Matrix £ has the I = min(m,n) real nonnegative generalized 
singular values o\ > a% > . . . > ai > on its diagonal. Eqs. VS. 6]) are called the 
generalized singular value problem for matrix A with respect to matrices B and C . 

It is easy to see that the generalized singular triplets (c, u, v) of generalized SVD 
(|3~7]) satisfy Eqs. (f3~6|) . When B = I m and C = I n , generalized SVD (|3J| reduces to 
the standard SVD. 

It has to be remarked that the notion of generalized SVD as defined above is 
different from the more commonly used generalized SVD of A £ M mxn with respect 
to B £ ]R pxn (with m > n), as, for example, defined in [21], p. 471. Definition 13.11 is 
the sense of generalized SVD that we need in this paper. While Eq. (|3.7|) is a natural 
generalization of the singular value decomposition and relates to it in the same way 
the generalized eigenvalue problem (as it is commonly defined) relates to the standard 
eigenvalue problem, we have not been able to find it in the literature yet. In what 
follows, we formulate the properties of the generalized SVD that are useful for the 
calculations to be done in our multilevel cycles. We discuss existence and uniqueness, 
which is important for the well-posedness of our multilevel cycles, and we explain how 
the generalized SVD can be calculated, which we will need to do on the coarsest level 
of our multilevel cycles. 

Theorem 3.2. Generalized SVD \3.7\ ) has the same existence and uniqueness 
properties as the standard SVD. 

Proof. This follows from a simple change of variables: with 

T = B 1/2 U, 

W = C 1/2 V, (3.8) 
D = B- 1 / 2 AC- 1 ' 2 , 

generalized SVD (|3.7| can be rewritten as a standard SVD 

D = TY 1 W t . (3.9) 

□ 

This change of variables provides a first manner of computing generalized SVD 
(13.71) using standard SVD algorithms. An alternative way of computing generalized 
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SVD (|3.7[) proceeds as follows. Let 



X 



Y 



A 

A 1 

B 

c 



It is clear that X is symmetric and Y is SPD, and 

{X - oY) z = 0, 



(3.10) 
(3.11) 

(3.12) 



is a symmetric generalized eigenvalue problem of size (m+n) x (m+n), with m+n real 
eigenvalues o~j and associated eigenvectors [u* k*] 4 , which can be chosen orthonormal 
with respect to Y . The following theorem indicates how the solutions of this gener- 
alized eigenvalue problem can be used to compute the generalized singular triplets of 
generalized SVD (j3T7| . 

Theorem 3.3. Let A g M mxn 7 B e ]R mxm and C g R nxn , with B and C SPD. 
Let I = min(m, n). Then generalized eigenvalue problem 





A 




' B 


" 
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A 1 


— a 
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) 


V 



0, 



(3.13) 



n solution triplets (a,u,v) with linearly independent eigenvectors [u* v*] 1 =/= 0. 

satisfying orthogo- 



has m- 

There are I independent solutions with o~j > and vectors Uj and Vj 



nality relations u* Bui — 6i.j and t>* C v, = 5ij (j 



,1). The triplets (aj, Uj, Vj) 



are the generalized singular triplets of A with respect to B and C . Furthermore, there 
are I independent solutions (—<jj,Uj, —Vj). Finally, there are abs(m — n) = m + n — 21 
independent solutions with a = and either u = or v = 0. 

Proof. This follows directly from the variable transformations Eqs. (|3.9[) . which 
transform generalized eigenvalue problem Eq. (|3.13[) into eigenvalue problem 








D ' 




I'm 
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D* 





— a 





In _ 
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W 



0, 



(3.14) 



which has the properties listed in the theorem, see, for example, [21], p. 427. □ 

A third possible way to calculate generalized SVD (|3.7j) is by solving for the left 
and right generalized singular vectors separately, using 



(A 1 B^ 1 A) v — a 2 C v, 
(AC~ 1 A t )u = a 2 Bu. 



(3.15) 



3.3. Bootstrap AMG V-cycles. In this section, we describe how we use the 

bootstrap AMG approach [5] to find approximations of the desired nf, dominant sin- 
gular vectors and values, and adaptively determine interpolation operators that ap- 
proximately fit the singular vectors. We follow the approach described in [57]. For 
completeness and definiteness, we briefly describe all steps in the process, with some 
details filled in in subsequent sections. 

We first describe the initial BAMG V-cycle. We start out on the finest level 
by choosing nt random test vectors for each of u and v, and we place them in the 
columns of Ut and Vt , respectively. We relax on the test vectors (using a few iterations 
of the SVD power method for Eq. p. II) . see below) such that components with small 
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cr are damped and components with large a become dominant in the test vectors. We 
coarsen the finest grid (see below) and determine interpolation operators P , Q, where 
P fits the vectors in Ut (in a least-squares sense) , such that they lie approximately in 
the range of P, and Q fits the vectors in Vt, such that they lie approximately in the 
range of Q. We also build coarse-level operators A c , B c , and C c according to Eqs. 
(|3.4p . We then restrict the fine- level Ut and Vt (by injection) to the first coarse level, 
and obtain coarse versions of the test vectors, stored in the columns of U c ,t and V c ,t- 
We relax on U C: t and V Ct t with the power method applied to Eqs. p.5[) . The whole 
process of building new, coarser interpolation operators P and Q and operators A c , 
B c , C c , by restricting U c> t and V c j is then repeated recursively, up to some coarse 
level where the problem is small enough for a direct generalized SVD calculation. 

On the coarsest level, we compute n& dominant singular triplets by a direct de- 
composition, and store them in vector <ib and matrices Ub and Vb- These singular 
triplets are the starting approximations for our desired dominant singular triplets, 
and will be improved in this cycle and subsequent cycles. We call the singular vectors 
of these triplets the boot (singular) vectors, and use the subscript b to refer to them. 
(We distinguish these from the initially random test vectors in Ut and Vt, which are 
used to get the process going and sustain it, but do not directly lead to the desired rib 
singular triplets themselves.) Note that we denote by Ob a vector with rib components 
that holds approximations for the dominant singular values sought. 

In the upward phase of the first BAMG V-cycle, starting from the coarsest level, 
we recursively interpolate the boot singular vectors Ub and Vb up to the next finer 
level, using the interpolation operators P and Q of the current level, according to 
multiplicative update formulas Eqs. (|3.2p . On each finer level, we first relax on the 
boot vectors using Eqs. (|3.5|) with the singular values in Ob fixed, and then update 
the elements of Ob by recalculating the Rayleigh quotient for each pair of boot vectors 
(see below). Note that the test vectors U t and V t are not used in the upward phase 
of the V-cycle. 

This initial BAMG multiplicative V-cycle can be followed by several additional 
multiplicative V-cyclcs. In the downward sweep of each of these additional cycles, 
one relaxes Ut and Vt as in the first V-cycle. In addition, one also relaxes the Ub and 
Vb, and improves the <7t> on each level, as in the upward sweep of the first cycle. At 
each level, the vectors in both Ut and Ub are used to fit P, and the vectors in both 
Vt and Vb to fit Q. Then A c , B c and C c are also rebuilt using the new P and Q. The 
upward sweeps of the additional multiplicative cycles are the same as in the initial 
multiplicative cycle. At the end of every V-cycle, we optionally also apply a collective 
Ritz projection step (see below) to improve the boot vectors Ub, Vb and singular value 
approximations <Jb- We do so for the numerical tests reported in Sec. [6] 

Note that in this paper we use only the simplest type of multilevel cycles, namely, 
V-cycles. More sophisticated cycles including W-cycles and full multigrid (FMG) 
cycles j!5l HJ H21 [37J can be considered and may lead to improved results, but for 
simplicity we only use V-cycles here. In the following sections we will give the details 
of the relaxation schemes, coarsest-level solve, coarsening and interpolation used in 
our BAMG cycles. 

3.4. Relaxation Scheme for the Test Vectors. Seeking dominant singular 
triplets, we base relaxation for the initially random test vectors on the power method 
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applied to Eq. (|3.6p . On any level, given an initial Uj, we solve for vj from 

A 1 uj =Cvj, 

Vj = Vj/{v]Cvjfl\ (3.16) 



and then for Uj from 



Avj = Buj, 



Uj = Uj/^jBuj) 1 ' 2 . (3.17) 

This can be repeated fit times on each level. In practice, we solve for the new Vj and 
Uj in an inexact way, by performing \i t j inner iteration steps of weighted Jacobi with 
weight ujj. For example, for Eq. (|3.16[) we iterate on: 

= _ UjD -i _ A t Uj) (3 lg) 

with Vj = Vj initially and with the iteration index of the weighted Jacobi procedure 
indicated in superscript. Here, Dc is a diagonal matrix with the diagonal of the SPD 
matrix C on its diagonal. In the numerical results reported in Sec. [51 we use ujj = 0.7 
and [it, j = 1- 

3.5. Relaxation Scheme for the Boot Vectors and Update Formulas for 
the Singular Values. For the boot vectors, we relax on 

Av = a Bu + k, (3.19) 
A*u = aCv + r. (3.20) 

(Note that in the multiplicative phase K = and r = on all levels, but the additive 
phase will require nonvanishing n and r, so we already include them in the formulation 
here.) On any level, given an initial Uj, Uj and Vj, we solve for a new Uj from Eq. 
(|3.19p . and then for a new Vj from Eq. (|3.20[) . This amounts to a block Gauss-Seidel 
(GS) scheme for equation system (|3Tl9"]l - (|3T2"0j) . For dominant as, Eqs. ([339]) - ([330]) 

(X-aY) [«* «*]* = faV]*, (3.21) 

may be close to diagonally dominant, so this will work well in many cases. For some 
problems or on coarser levels, the block GS approach may not converge well, and 
Kaczmarz relaxation [36l [27] (see also below) on Eq. (|3. 211) or its blocks may be 
preferable. In our block GS approach, we again approximate the solutions of Eqs. 
(I3.19[) - (|3.20[) in an inexact way, by performing fi^j inner iteration steps of weighted 
Jacobi. For example, for Eq. (|3.19p we iterate on: 

uf +1) = uf - wj D B X (B uf - (Avj - K)/aj). (3.22) 

In the numerical results reported in Sec. [6l we use fx^j = 1. In the multiplicative 
phase, with k = 0, r = on all levels, we also update the as after every outer 
relaxation iteration on each level. The easiest way to do this is to use Rayleigh 
quotient formula 

u f Av 

a = (utBu) 1 / 2 (vtCv) 1 / 2 ( ' ' 

for each boot singular triplet, which is what we do in the numerical results presented 
in Sec. [5] 
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3.6. Coarsest-grid Solution. Each time the coarsest level is reached, we de- 
termine new approximations for the coarsest-level boot triplets by direct compu- 
tation of the coarsest- level generalized SVD, Eq. (|3.7p . The rib singular triplets with 
the largest singular values are selected as the new boot singular triplets. In our imple- 
mentation, we choose to solve generalized eigcnproblem (|3.13j) of Theorem 13.31 using 
a direct eigendecomposition algorithm. 

3.7. Building P and Q: Coarsening and Sparsity Patterns. In order to 
build interpolation operators P and Q, at each level, we first coarsen the sets of 
unknowns in u G M rn and v G JR n by choosing a set of m c coarse-grid variables, 
C u , out of the m fine- level variables for u, and by choosing a set of n c coarse-grid 
variables, C v , out of the n fine-level variables for v. The coarse variables are called 
coarse-grid points or C-points. The fine-level w-variables that are not selected as C- 
points are called F-points and are denoted by the set F u . Similarly, the F-points of 
the fine- level w-variables are denoted by F v . Well-known algorithms from AMG are 
used to determine C u and C v and the sparsity patterns of P and Q on each level, 
based on the idea of strength of connection in the operator matrices A. After this 
coarsening process, the matrix elements of P and Q are determined using a least- 
squares approach in such a way that the test vectors U t and V t (and, after the initial 
cycle, also the boot vectors [/& and Vj,) lie approximately in the ranges of P and Q, 
respectively. 

For the coarsening process for the u-variables, we propose to apply standard AMG 
coarsening methods to matrix A A 1 , and we base coarsening of the i;- variables on A t A. 
(If A is square or square and symmetric, other choices can be made, see below.) 

Algorithm 1: one-pass Ruge-Stueben coarsening algorithm 
set U <— all fine-level points; C <— empty; F <— empty; 

for all fine-level points i, set A, <— number of points strongly influenced by i; 
while U not empty do 

select one of the i G U that has a maximal A*; 

make i a C-point (C = C U i, U = U 

make all j £ U that are strongly influenced by i, new F-points (F = F U j, 



increment A& for all k G U that strongly influence the new F-points j; 
end 



We implement coarsening as follows. For the it-variables we employ the standard 
one-pass Rugc-Stucben coarsening algorithm [35] (see Algorithm[T]) on N — AA l using 
strength of connection condition 



with < 9 < 1 a fixed strength parameter that may be chosen dependent on the 
problem. (The (i, j) matrix element of N is denoted by riij.) For diagonally dominant 
PDE discretizations, strength is often determined relative to the largest off-diagonal 
element in row i, using condition \riij\ > 6 max^i We, however, target a 

broader class of problem matrices, and opt for strength condition (|3.24j) . which is 
somewhat more general. Note, however, that the magnitude of strength parameter 



U = U\j); 




(3.24) 




A- 
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typically needs to be chosen differently in the two approaches. For the w-variablcs, 
we determine strong connections in the same way, for matrix A 1 A. Once the strong 
connections are determined, coarsening can be performed: Algorithm [1] is executed 
to determine sets of C-points and F-points for the it-variables and the -(/-variables. 

In a next step, first for the u- variables, we determine, for each F-point i in F u , a 
coarse interpolatory set which contains all C-points (points in C u ) that strongly 
influence point i according to condition (|3.24[) in AA l . The coarse interpolatory sets 
C\ of the w-variable F-points are determined in the same way based on A 1 A. This 
defines the sparsity patterns of the interpolation operators P and Q. We explain this 
for P, and it is analogous for Q. For each C-point in C u with fine-level index i, we 
let a(i) be the index of point i on the coarse level. For all points i in C u , row i in P 
is zero, except for Pi ;a u-) = 1. For all F-points i in F u , row i in P is zero, except for 
matrix elements Pi a (j) where j is an element of i's coarse interpolatory set C' l u . 

Basing coarsening of the w-variables and the v- variables on AA t and A t A, respec- 
tively, can be motivated by the observation that, on the finest level, the left singular 
vectors are eigenvectors of A A 1 , and the right singular vectors are eigenvectors of A t A. 
Moreover, AA l and A 1 A are symmetric matrices, and AMG was built for that type 
of matrices. In that sense, using AA l is a natural choice for measuring connection 
strength between it-variables. Also, forming AA l can be done in 0(m) (assuming 
m > n) time for large classes of sparse matrices, so it does not overly add to the cost 
of our method. Note also that we only use AA l for coarsening, and not in the rest of 
the algorithm, so there is no deterioration in terms of condition numbers, which is a 
major reason to avoid calculating the left singular vectors as the eigenvectors of AA f , 
and the right singular vectors from A t A). Note also that Eqs. (|3.15p suggest basing 
coarsening on A 1 B~ x A and AC -1 A 1 on coarser levels rather than A 1 A and AA l , 
but we normally choose to ignore the B~ x and C _1 mass matrix factors to avoid the 
extra matrix inversion and matrix product. 

For some applications, however, it may be possible to devise good coarsening 
schemes for u and v directly from the rectangular matrix A, by considering its rows 
and columns. We expect, however, that the details and success of such strategics 
may be highly dependent on the type of problem, and direct coarsening methods for 
row-variables and column-variables of rectangular matrices is kept as an interesting 
topic of further research. 

3.8. Building P and Q: Least-Squares Determination of Interpolation 
Weights. We use a least-squares (LS) process to determine the interpolation weights 
in the rows of P and Q that correspond to F-points, following the approach in [51 127]. 
Again, we explain the process for matrix P, and it is analogous for Q. We want to 
fit the interpolation weights of P such that the n t current fine-level test vectors Ut 
and the rib current boot vectors Ub (except in the first cycle) lie approximately in the 
range of P. Let Uf hold in its columns the nt = n t + rib vectors to be fitted. Let Uk be 
the fcth vector in Uf. Let iik, c be the coarse- level version of Uk obtained by injection, 
and let u 3 k c be its value in coarse-level point j. Also, let u\ be the value of Uk in 
fine-level point i. The weights of each F-point row in P are determined consecutively 
using independent LS fits. Consider a fixed F-point with fine-level index i (the row 
index of P). Its coarse interpolatory set is C l u , and we assume now that the points 
in C l u are labeled by their coarse- level indices (the column indices of P). Let n c ,i be 
the number of elements of C l u . For each F-point i we solve the following least-squares 
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problem to determine the unknown interpolation weights Pij: 

4 = Eftj'i (k = l,...,n f ). (3.25) 

This is a system of n/ equations in n c ,i unknowns. We make this system overdeter- 
mined in all cases by choosing the number of initially random test vectors, nt, larger 
than the expected largest interpolation stencil size n c j for any i on any level. (This 
is one of the criteria guiding the choice of nt, and, in our implementation, estimating 
n t too small initially may require a restart of the method with a larger n t ). Since 
we would like the dominant boot vectors to be fitted preferentially as soon as they 
become reasonable approximations, we weight the kth equation in Eq. (|3.25[) by the 
Raylcigh quotient, (|3.23[) . of the pair (uk, Vk), see also |27j . In our implementation, 
we solve the LS problem using a standard normal equation approach. Finally, we 
mention that we use a modification of Eq. (|3.25|) for the case of minimal singular 
triplets or eigenpairs, as proposed in [30j . For these cases, interpolation weights and 
convergence can be improved significantly by applying an extra fine-level Jacobi re- 
laxation (using the operator we base strength on) to the F-point values u\ in Eq. 
(|3.25p (but not the C-point values u 3 k ), see [30] for further details. We have found in 
our numerical experiments that this modification is not useful when seeking dominant 
singular triplets or eigenpairs. 

4. AMG SVD Algorithm: Additive Phase. In the additive (solve) phase 
of our algorithm, we use fixed interpolation and coarse- level operators, namely, the 
operators P. Q, A c , B c and C c as they were determined on all levels in the last 
mutiplicative cycle, and use an additive correction scheme to improve the boot 
singular triplets that came out of the multiplicative (setup) cycle at the finest level. In 
each iteration of the additive phase, for each of the finest-level aj, Uj, Vj (1 < j < rtj,) 
in <Jb, Ub, Vt, we first improve Uj and Vj in a classical- type additive AMG V-cycle 
with aj fixed in the whole cycle. Then, after all the Uj and Vj have been updated 
using one V-cycle for each pair, we collectively improve all the aj, uj and vj in <Jb, 
Ub, Vb using a Ritz projection step on the finest level. These multigrid-Ritz iterations 
are repeated until the desired accuracy is reached. Our solve phase is similar to the 
approach described by Borzi and Borzi in [4] for calculating minimal eigenpairs of 
an SPD matrix using standard AMG interpolation operators (it is also described in 
[27], but not combined with a multiplicative phase). We now extend this approach 
to the calculation of dominant SVD triplets using the self-learned operators from the 
multiplicative phase of the algorithm. 

4.1. Coarse-level Equations. In the additive correction scheme, the equations 
for triplet (aj, Uj, Vj) on the current level are given by 

Av 3 -a 3 Bu 3 

A*u j -tT j Cv j =T j , (4.1) 

where Kj and Tj are the residuals restricted down from the next finer level. (So Kj = 
and Tj = on the finest level.) 

The equations on the next coarser level arc then 



■ I Vj iC - aj B c Uj^ c = I'' rj, 
A\uj^ c - a j C c v j)C = Q* Sj, 



(4.2) 
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where Tj and Sj are the residual vectors of the first and second fine-level equations, 
respectively, and the coarse-grid correction equations are given by 



where the superscript (i) means the ith iterate. Note that itj c and Vj. c now represent 
coarse-level additive errors of the fine-level quantities Uj and Vj . Rather than using 
new variable names to distinguish original variables an their coarse-level errors, we 
follow the convention that is common in the multigrid literature [15] to refer to vari- 
ables and their coarse-level errors with the same letter from the alphabet, which aids 
in presenting the algorithm in a recursive way. 

Note that for the eigenvalue solvers in [?J [37] the additive method is described in 
the framework of the full approximation scheme (FAS), like in the paper in which the 
general ideas of this approach were originally proposed [5] , where the FAS framework 
was required because eigenvalue approximations were modified on the coarsest level 
of each cycle. However, in the additive methods in [U [57J, eigenvalues remain fixed 
for the entire additive cycle, so there is no need to use the FAS, and the simpler error 
equation formulation that is common in multigrid for linear operators can be used 
instead, which is what we do in our discussion here. 

4.2. Additive V-cycles to Improve the Left and Right Singular Vectors. 

For each of the finest-level aj, Uj, Vj (1 < j < rib) in &b, Ub, Vb, we fix aj and perform 
an additive V-cycle as follows. We relax the singular vectors Uj and Vj using Eq. (|4.ip 
on the finest level, with the relaxation method that was described in Sec. 13.51 We 
calculate the residuals Kj and Tj , and restrict them to the next coarser level. We then 
choose a zero initial guess for Uj. c and Vj )C and relax them using coarse equations (|4.2I) . 
we calculate the coarse residuals, restrict them to the next coarser level, etc. This is 
repeated recursively up to some coarse level where the problem is small enough for 
a direct solve. On the coarsest level, we solve Eq. (|4.2[) exactly for vector [tij c ^ J* 
(as in Eq. (|3.21[l ). To make the coarse- level solve somewhat more robust when the 
operator is close to singular, one can optionally use the pseudo-inverse (calculated via 
the SVD) of X — a Y without including the component corresponding to its smallest 
singular value, as suggested in [55] • We do so in the numerical results presented in 
Sec. We then interpolate the coarsest-grid solution up, correct using Eqs. (|4.3[) , 
relax the corrected vectors, interpolate up again, etc., recursively until the finest level. 

4.3. Ritz Projection Step on the Finest Level to Improve the Boot 
Singular Triplets. After carrying out one V-cycle for each of the rib boot singular 
triplets, we perform a Ritz projection step, as in [4l I27j. An alternative would be to 
update each Oj in &b using Rayleigh quotient formula (|3.23[) . However, a collective 
Ritz step leads to faster overall convergence, and has other important advantages. For 
singular values with multiplicity larger than one, it provides orthogonal singular vec- 
tors, and it precludes convergence of some of the triplets in the finest-level <7b, Ub and 
Vb to spurious duplicate triplets, which may occur with the cts updated individually 
according to Eq. p.23[) . 

The Ritz step proceeds as follows. We first orthogonalize the columns of Ub with 
respect to B using the QR decomposition, and we orthogonalize the columns of Vb 
with respect to C. (Note that B = I m and C = I„ on the finest level, but, in the 
multiplicative phase, the Ritz procedure can in principle also be employed on coarser 
levels, so we prefer to give the more general equations here.) Let U and V be the 



u 



(i+l) 



= uf + P uj, 
= vf ) +Qv j , 



,(<+!) 
3 



(4.3) 
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orthogonalizations of Uf, and V&, and let U = span(CT) and V = span(V"). We seek 
new Uj € U, Vj € V, and <jj (1 < j • < n^) such that 

(u, A Vj — a j B Uj) B = Vu e U, 

(v, A* Uj - a 3 Cvj) c = to 6 V. (4.4) 

These equations express that the residuals are desired to be orthogonal (i?-orthogonal 
and C-orthogonal, respectively) to the spaces U and V in which we seek an improved 
approximation. Eq. (|4.4[) can be expressed in terms of new variables y, yj G M rnc and 
z, Zj € J?™ c with u = Uy,v = Vz,Uj=Uyj and Wj = V"^-, as 

[y,U l AV Zj-ajU 1 BUy^ = Vy e R m % 
'z^Atfjyj -Vj V l CVz^ = VzeJR"'. (4.5) 
The following generalized eigenvalue problem of size 2 n& x 2 n& results 

C/'B £> 



U l AV 
V 1 A l XJ 



o ytcv 



) 




= 









(4.6) 



According to Theorem 13.31 the eigenvalues of Eq. (|4.6[) occur in pairs symmetrically 
about zero, and it is sufficient to consider the triplets (aj, yj, Zj) with the largest 
values for tjj to generate new approximations (<7j,U yj ,V Zj) for the dominant singular 
triplets on the finest level. 

Note finally that, unlike the multiplicative cycles, the multigrid-Ritz additive 
iterations can converge to any required accuracy, even though, on each level, the 
Uj are not exactly in the range of the Ps, and the VjS are not exactly in the range 
of the Qs. In practice, as demonstrated in the numerical tests below, the hybrid 
multiplicative-additive scheme converges up to machine accuracy if desired. 

5. AMG SVD Algorithm: Specialization and Extension. In this section 
we discuss the specialization of the dominant singular triplet algorithm for rectangular 
matrices to the case of square matrices and symmetric matrices (dominant cigenpairs), 
and its extension to the case of minimal singular triplets (and eigenpairs) . 

5.1. Singular Triplets of Square Matrices. A possible simplification for 
square, nonsymmetric matrices is that interpolation operators P and Q could po- 
tentially be based on A and/or A'; it does not appear to be necessary to form AA* 
and A* A, so that cost may be saved. Interestingly, if one wants to keep square matri- 
ces on all levels, coarsening and sparsity patterns for P and Q should both be based 
on either A or A* , because coarsening of A and A* may lead to different numbers of 
coarse grid points (except if a coarsening method is used that is symmetric). If the 
left and right singular vectors arc expected to be very similar such that they can all 
be fitted with reasonable accuracy by one interpolation operator, P and Q could even 
be taken the same on all levels ; in that case it would also hold that B c = C c on all 
levels, which can be exploited for further cost savings. 

5.2. Eigenpairs of Symmetric Matrices. In the case of symmetric matrices, 
the whole algorithm simplifies significantly, and becomes a combination of the minimal 
SPD eigenpair algorithms of [4] and [27], extended to dominant eigenpairs. The 
resulting algorithm can be formulated in terms of operators A, B and P on all levels. 
This combination of a multiplicative and an additive scheme into a hybrid method 
for eigenpairs has the advantages that it can converge up to machine accuracy for 
multiple eigenvectors with one P, and that it is self-learning. 
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5.3. Minimal Singular Triplets and Minimal Eigenpairs. With just a few 
small modifications, the hybrid multiplicative-additive dominant singular triplet al- 
gorithm described above can also be used to compute the rib singular triplets with 
smallest singular values. All that is required is to modify the relaxation schemes, and 
to select the smallest singular triplets as new boot singular triplets in the coarsest- 
level solve of the multiplicative phase. The weights in the LS fitting of the test and 
boot vectors is taken as the inverse of the Rayleigh quotient, see also [23 [TOl [3] . For 
the relaxation of the n t initially random test vectors in Ut and VJ, we iterate on Eqs. 
(|3.6I) with cr = using Kaczmarz relaxation (see [SHI HZ]). Richardson iteration as in 
[13j can be considered as another option for relaxation. For the relaxation of the n& 
boot vectors in Ub and Vb, we iterate on Eqs. (|3.19[) - (|3.20[) (with the small as from <7ft) 
in a block GS fashion using Kaczmarz relaxation [5^1 HZ| for the blocks. Numerical 
tests show that these Kaczmarz relaxations may sometimes result in singular vector 
pairs that produce a negative Rayleigh quotient. We test for this and reverse the sign 
of one of the singular vectors if this happens. In the case of minimal eigenpairs of 
symmetric matrices, GS relaxation on A x = can be used, with Kaczmarz on coarser 
levels, see [57J. In the numerical results reported below, we use Kaczmarz relaxation 
on all levels when seeking minimal singular triplets or eigenpairs. Note also that, 
since our method is self-learning, the minimal SPD eigenpair problem can in principle 
also be solved simply by shifting the operator such that the spectrum ends up at the 
other side of the origin, and then the algorithm for dominant eigenpairs can be used 
(and vice versa). 

6. Numerical results. In this section, we present numerical results illustrating 
how our proposed method performs. We discuss four different test problems that 
cover the different cases of rectangular matrices, square nonsymmetric matrices, and 
symmetric matrices. 

6.1. High-Order Finite Volume Element Laplacian on Unit Square. In 

the first test problem, we seek a few extremal singular triplets of a square, nonsym- 
metric matrix that results from a finite volume element (FVE) discretization with 
quadratic polynomials of the standard Laplacian operator on the unit square with 
Dirichlct boundary conditions, see [391 [T]. The operator is discrctized on a structured 
triangular grid. For this problem, the FVE method with linear polynomials gives a 
discretization that is exactly the same as the Galerkin finite element discretization 
with linear polynomials. For higher orders, however, the FVE discretization is slightly 
non-symmetric . 

Figs. 16.11 and 16.21 show convergence results for approximating the largest and 
smallest singular values, respectively, for a matrix with m = n = 961 (31 x 31 internal 
grid points). We show the base-10 logarithm of the relative error in the calculated 
singular values 

Inexact &approx\ ,\ 

error = — , (6.1) 

^ exact 

as a function of the number of V-cycles. Here, the values a exa ct are high-accuracy 
approximations obtained by Matlab's built-in SVD algorithms. There are 10 mul- 
tiplicative (setup) cycles followed by 30 additive (solve) cycles. We have calculated 
rib = 8 dominant or minimal singular triplets, using nt = 5 initially random test 
vectors. We used fXt = 4 relaxations on the test vectors, and [ib = 4 relaxations 
on the boot vectors, on all levels. The coarsening strength parameter was chosen as 
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Fig. 6.1. Largest Singular Values for High-Order Finite Volume Element Laplacian on Unit 
Sguare (square, nonsymmetric matrix). Convergence plot for calculation of the eight largest singular 
values (base-10 logarithm of relative error in singular values as a function of number of V-cycle 
iterations). Singular values are labeled with decreasing magnitude (label 1 denotes the largest singular 
value). The 10 V-cycles to the left of the vertical line are multiplicative, and the 30 V-cycles to the 
right of the vertical line are additive. 



6 = 0.05. Coarsening and sparsity patterns for both P and Q are determined using 
A, thus guaranteeing square matrices A on all levels. 

The figures show that the extremal singular triplet algorithm carries out the task 
that is was designed for: it collectively calculates several singular values up to machine 
accuracy in a modest number of multigrid V-cycles, and this both for the dominant 
triplet and the minimal triplet case. The initial, multiplicative phase approximately 
determines singular triplets starting from initially random test vectors, but conver- 
gence stagnates after a few operations because it is limited by the accuracy by which 
the singular vectors arc represented collectively by single interpolation operators. A 
second, additive phase succeeds in driving the error to machine accuracy, using the 
(fixed) interpolation operators that were derived in the last multiplicative iteration. 
This shows that the approach is able to fit interpolation to the relevant vectors both 
for the cases of dominant and minimal triplets. 

For conciseness, we will limit ourselves to plot the relative errors in singular 
values or eigenvalues in this paper. Convergence of these properties goes along with 
high-accuracy convergence of other quantities like residuals, angles between exact and 
approximate singular vectors, orthogonality measures between singular vectors, etc. 
All these quantities also converge with high accuracy in our numerical tests, but they 
are not shown for conciseness. Since our code is implemented in Matlab and is not 
optimized, we do not directly compare with other, optimized codes in terms of CPU 
time, but instead focus on reporting convergence numbers as a function of numbers 
of V-cycle iterations, which gives valuable insight in the effectiveness of our method, 
since the cost of a V-cycle is approximately linear in the number of unknowns, m + n. 

For the case of dominant singular triplets (Fig. I6.1[) . the calculation uses four 
levels, with coarsest size 45 x 45. For the case of minimal singular triplets, five levels 
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were obtained, with a coarsest grid of size 51 x 51. Sec Table RTT1 for approximations 
of the singular values calculated. It can be seen that the singular values lie very close 
to each other, which makes this a difficult type of problem for many iterative singular 
value decomposition algorithms. Nevertheless, our algorithm converges to machine 
accuracy in a moderate number of V-cyclcs. Note also that the non-symmetry of 
the discrete operator has lifted the degeneracy of the continuous operator, which has 
eigenvalues with multiplicity larger than one; no singular values with multiplicity 
larger than one arise. 
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Fig. 6.2. Smallest Singular Values for High-Order Finite Volume Element Laplacian on Unit 
Square (square, nonsymmetric matrix). Convergence plot for calculation of the eight smallest singu- 
lar values (base-10 logarithm of relative error in singular values as a function of number of V-cycle 
iterations). Singular values are labeled with increasing magnitude (label 1 denotes the smallest sin- 
gular value). The 10 V-cycles to the left of the vertical line are multiplicative, and the 30 V-cycles 
to the right of the vertical line are additive. 
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Table 6.1 



Singular values and eigenvalues sought for each problem (high- accuracy approximations) . 



6.2. Finite Difference Laplacian on Unit Square. We now consider the 
case of a simple finite-difference (FD) Laplacian with Dirichlet boundary conditions 
discretized with a 5-point stencil on a unit square with a Cartesian grid. This leads to 
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Fig. 6.3. Smallest Eigenvalues for Finite Difference Laplacian on Unit Square (square, sym- 
metric matrix). Convergence plot for calculation of the eight smallest eigenvalues (base- 10 logarithm 
relative error in eigenvalues as a function of number of V-cycle iterations). Eigenvalues are labeled 
with increasing magnitude (label 1 denotes the smallest eigenvalue). The 15 V-cycles to the left of 
the vertical line are multiplicative, and the 30 V-cycles to the right of the vertical line are additive. 

a symmetric matrix (it is SPD), and we seek minimal and dominant eigenpairs. We 
use strength of connection 9 = 0.06 and seek n& = 8 minimal or dominant eigenpairs, 
using n t = 6 initially random test vectors. We used jit = 8 relaxations on the test 
vectors, and /ib = 4 relaxations on the boot vectors. We perform 15 multiplicative 
cycles followed by 30 additive cycles. The problem size is m = n = 1024 (32 x 32 
internal grid points). Table ItTTl shows that there are eigenvalues with multiplicity 
larger than one for this symmetric discretization. 

Fig. 16.31 shows convergence results for the case of minimal eigenpairs. Five levels 
are used and the coarsest grid is of size 64 x 64. These results can be compared with 
the results of the additive-only eigenvalue method of Borzi and Borzi ([4]) and the 
multiplicative-only eigenvalue method of Kushnir, Galun and Brandt ([27]). Our ad- 
ditive phase is like the method in [4], but in that paper standard AMG interpolation 
is used. We appear to get similar results, but our method is more general and can 
also be applied to seeking dominant eigenpairs and to a wider set of problems due to 
its self- learning capacity. Our multiplicative phase is like the method in [57]. We see 
that convergence stagnates at the level of accuracy by which interpolation collectively 
represents the desired eigenvectors. (Note that in our combined algorithm it would 
have been sufficient to perform less than 15 multiplicative cycles.) In |27] interpo- 
lation is made more accurate to improve the accuracy level at which the collective 
multiplicative phase stagnates. As explained in that paper, the accuracy that can 
be obtained in this way may be sufficient for some applications, for example, due to 
unavoidable discretization errors in PDE problems, or due to data and model un- 
certainties in data analysis tasks. In our approach, we show that, if desired, higher 
accuracy can be obtained by combining the multiplicative and additive approaches, 
resulting in a method that is flexible enough to deal efficiently with a variety of prob- 
lems due to its self-learning capabilities. Fig. l6.4| gives convergence results for the case 



22 




15 20 25 
V-cycles 



Fig. 6.4. Largest Eigenvalues for Finite Difference Laplacian on Unit Square (square, symmet- 
ric matrix). Convergence plot for calculation of the eight largest eigenvalues (base-10 logarithm of 
relative error in eigenvalues as a function of number of V-cycle iterations). Eigenvalues are labeled 
with decreasing magnitude (label 1 denotes the largest eigenvalue). The 15 V-cycles to the left of 
the vertical line are multiplicative, and the 30 V-cycles to the right of the vertical line are additive. 



of dominant eigcnpairs. Four levels are used and the coarsest grid is of size 52 x 52. 
The results show that our hybrid multiplicative-additive method can also compute 
dominant eigcnpairs, extending the approaches for minimal eigenpairs from [H I27j 
to dominant eigenpairs. Convergence in the additive phase appears somewhat slower 
than for the minimal eigenpairs case. This may be due to the fact that we employ 
Kaczmarz relaxation for the minimal eigenpairs, which is more efficient but also more 
expensive than the inexact power method relaxation used for the dominant eigcnpairs 
case (Sec. 13.5)) . It is interesting to note that the approach in [4] which uses standard 
AMG interpolation, can also be extended to calculating dominant eigcnpairs simply 
by changing the signs of all off-diagonal interpolation weights. The resulting interpo- 
lation operators turn out to be good fits for the most oscillatory modes, and can be 
used in an additive scheme to approximate the dominant eigcnpairs. 

6.3. Planar Random Triangulation Graph Laplacian. The next test prob- 
lem is the graph Laplacian operator of a planar random graph that is obtained by 
placing points uniformly random in the unit square and determining their Dclauncy 
triangulation graph. With A the adjacency matrix of the graph, the graph Lapla- 
cian, A, can be constructed by setting A = —A and placing the row sums of A on 
the diagonal. This results in a symmetric semi-definite matrix (it has one vanishing 
eigenvalue), and we seek dominant and minimal eigenpairs. This problem is interest- 
ing as a test case because it is unstructured, contrary to the previous two problems 
which derive from structured grids. Graph Laplacian matrices are of interest in data 
analysis tasks [27]. We use strength of connection 9 = 0.05 and seek nt = 6 dominant 
or minimal eigenpairs, using n t = 6 initially random test vectors. We use \it = 1 
relaxations on the test vectors, and fib = 8 relaxations on the boot vectors. We 
perform 10 multiplicative cycles followed by 30 additive cycles. The problem size is 
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Fig. 6.5. Smallest Eigenvalues for Planar Random Triangulation Graph Laplacian (square, 
symmetric matrix). Convergence plot for calculation of the six smallest eigenvalues (base- 10 loga- 
rithm of relative error in eigenvalues as a function of number of V-cycle iterations). Eigenvalues 
are labeled with increasing magnitude (label 1 denotes the smallest eigenvalue) . The 10 V-cycles to 
the left of the vertical line are multiplicative, and the 30 V-cycles to the right of the vertical line are 
additive. 




V-cycles 

Fig. 6.6. Largest Eigenvalues for Planar Random Triangulation Graph Laplacian (square, 
symmetric matrix). Convergence plot for calculation of the six largest eigenvalues (base- 10 logarithm 
of relative error in eigenvalues as a function of number of V-cycle iterations) . Eigenvalues are labeled 
with decreasing magnitude (label 1 denotes the largest eigenvalue) . The 10 V-cycles to the left of 
the vertical line are multiplicative, and the 30 V-cycles to the right of the vertical line are additive. 



m = n= 1024. 

Fig. 16. 51 shows convergence results for the case of minimal eigenpairs. Three levels 
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Fig. 6.7. Same as Fig. \ 6.6\ but during the additive phase, whenever one or more of the eigen- 
values reach a relative error converge tolerance of le-14, the interpolation operators are redetermined 
and preferentially fitted to the unconverged eigenpairs. This improves the convergence of the eigen- 
pair that is slow to converge in Fig. \ 6. 6\ 



are used and the coarsest grid is of size 59 x 59. The operator is shifted by 0.01 to avoid 
problems in representing the relative error in the smallest eigenvalue (which vanishes 
for the unshiftcd operator) . Fig. 16.51 shows satisfactory convergence behavior, but 
convergence in the additive phase is not as good as for the finite difference Laplacian 
on a structured grid (Fig. 16. 3j) . even though we doubled fib to 8. This is most likely 
due to the fact that the minimal eigenvectors of the unstructured problem are less 
regular and less similar to each other, such that they are not represented as well by 
the collective interpolation operators. For this reason, we only sought six eigenpairs 
for this problem. We reduced the number of test vector relaxations because the 
eigenvalues are less clustered for this problem, and too many test vector relaxations 
quickly make the set of test vectors too linearly independent for the LS fits. Fig. 16.61 
gives convergence results for the case of dominant eigenpairs. Three levels are used 
and the coarsest grid is of size 77 x 77. It can be seen that the algorithm converges 
slowly for the sixth eigenpair. When one or more of the eigenpairs sought converge 
significantly more slowly than the others, the following strategy can be followed to 
improve their convergence. In the additive phase, once some eigenpairs have converged 
beyond a pre-spccified tolerance, one can redetermine the interpolation operators in a 
way to preferentially fit the eigenpairs that have not converged yet. Fig. l6.7l shows that 
this can improve the convergence of lagging eigenpairs. For the convergence curves 
shown in Fig. 16.71 whenever one or more of the singular values reach a relative error 
converge tolerance of le-14, we redetermine the interpolation operators (basically, by 
executing one downward sweep of the multiplicative phase), and reduce the weight 
of the already converged boot vectors and the test vectors by a factor of 1 000 in the 
LS fitting process. This can speed up the convergence of the remaining eigenpairs, as 
shown in Fig. 16.71 
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6.4. Medline Term-document Matrix. The final test matrix is a real term- 
document matrix, namely, the MEDLINE data set downloaded from the Text to 
Matrix Generator website ( |http://scgroup20.ceid.upatras.gr:8000/ tmg). The rows of 
this matrix represent terms and the columns represent documents. Matrix element 

counts how many times term i occurs in document j. The matrix is sparse 
(less than 1% nonzeros). Latent semantic indexing determines concepts in documents 
by calculating dominant singular triplets of term-document matrices [16] , so we seek 
to compute dominant singular triplets. We consider a rectangular submatrix of size 
m = 5 735, n = 1033. We use strength of connection 9 = 0.03 and seek n\, = 8 
dominant singular triplets, using n t = 14 initially random test vectors. We used 
fi t = 1 relaxations on the test vectors, and fib = 4 relaxations on the boot vectors. 
We perform 3 multiplicative cycles followed by 30 additive cycles. 

Fig. !6.8l shows convergence results for approximating the eight dominant singular 
triplets. The calculation uses five levels, and the coarsest grid is of size 415 x 198. The 
figure shows that our method is successful in calculating the eight dominant singular 
triplets, with good convergence and high accuracy. The importance of this proof-of- 
concept calculation is that it indicates that our approach is flexible enough to deal 
with this kind of problem that is new to multigrid (as far as we are aware). The 
self-learning feature of our method is able to adapt to the singular vectors that are 
relevant in this application, which is interesting by itself, since our development is an 
extension of algebraic multigrid concepts that were developed for PDEs, in which the 
relevant vectors are of a different nature. Similarly, we have obtained the result in 
Fig. l6.8l using a standard PDE-oriented AMG coarsening approach, and obtain results 
that appear to converge quite satisfactorily. It has to be noted, though, that the 
dominant singular values of term-document matrices may have larger gaps (see Table 
I6.1|) . especially for the very largest ones, which may make these problems somewhat 
easier for iterative methods than, for example, the FVE problem of Sec. 16.11 which has 
small gaps between the dominant (and minimal) singular values that decrease with 
increasing problem size. While we expect our method to be competitive for the latter 
type of problems, it remains to be investigated in future work how competitive our 
general approach can be made for problems like term-document matrices. For one, it 
would require to consider dedicated special-purpose coarsening methods. (We have 
already developed such special-purpose coarsening mechanisms for certain scale-free 
graphs, see [17], and see also [9j [31] for promising more general approaches.) In the 
case of rectangular matrices, it may be possible to come up with methods to coarsen 
the row and column variables based on A and A t directly (rather than using AA t 
and A 1 A), which may be feasible for some applications, guided by the application- 
dependent interpretation of the variables and operator matrix coefficients, and is kept 
for future work. Nevertheless, the proof-of-concept results presented here already 
show promise and illustrate the versatility of our general approach to calculating 
singular triplets. 

6.5. Discussion. The above numerical results show that the proposed combined 
multiplicative-additive approach is successful in calculating extremal singular triplets 
and eigenpairs, with high accuracy obtained in a modest number of V-cycles for a 
variety of problems. However, more research needs to be done to make the method 
more black-box and robust. There are quite a few parameters to be chosen, and suc- 
cess is sometimes sensitive to careful choice of these parameters. This is not unlike 
the situation that existed for AMG as a linear system solver early on in its devel- 
opment for that purpose; it took many years of concerted effort for AMG to ripen 
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Fig. 6.8. Largest Singular Values for Medline Term- document Matrix (rectangular). Conver- 
gence plot for calculation of the eight largest singular values (base-10 logarithm of relative error 
in singular values as a function of number of V-cycle iterations). Singular values are labeled with 
decreasing magnitude (label 1 denotes the largest singular value). The 3 V-cycles to the left of the 
vertical line are multiplicative, and the 30 V-cycles to the right of the vertical line are additive. 



to the successful linear system solver technology that it is today, and self-learning 
AMG eigensolvers and singular triplet solvers will require time and effort to mature 
as well. In addition, new types of application problems often require at least some 
modification in algorithmic components like coarsening schemes. In this sense, the 
present paper is still an early step in the development of AMG methods for singular 
triplets: it presents a framework and one particular way to implement the compo- 
nents, but these components have to be further improved and there are alternative 
candidates for some of them. For example, in the multiplicative phase, it is not al- 
ways easy to find a good choice for the number of relaxations to be done on the 
test vectors. Too many relaxations may lead to linear dependence (and how many 
is too many depends on the a priori not necessarily known gaps in the extrema of 
the spectrum), and not enough relaxations may lead to coarse- level problems that do 
not identify the correct singular triplets. Similarly, the choice of the weight factors in 
the LS fitting is also not straightforward and results may depend on it significantly. 
These aspects need to be improved. Similarly, in the additive phase, the V-cycles 
may not convergence for some of the tentative triplets, and there is no guarantee 
that no triplets are missed (even though we have only rarely observed this). Also, it 
would be interesting to consider special-purpose coarsening routines, for example, for 
principal component analysis data sets. For some applications, one may need mutiplc 
P and Q interpolation operators to fit groups of (possibly overlapping) triplets, or 
coarse grids with multiple degrees of freedom per coarse grid point may need to be 
considered, especially if singular triplets have singular vectors that are very dissim- 
ilar. In the mutiplicative phase, instead of using the BAMG approach, one could 
consider building up interpolation operators that fit the relevant vectors by using the 
so-called 'adaptive' approach from [HJ[H], an d possibly extending it to fit multiple 
target vectors. Similarly, the current multigrid-Ritz additive phase could be replaced 
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by methods of preconditioned inverse iteration, locally optimal block preconditioned 
conjugate gradient, or Rayleigh quotient multigrid type Also, compatible 

relaxation processes may be considered for coarsening [5] [21] . The results presented 
in this paper show initial success and promise for our general approach, but improve- 
ments and modifications of the components are possible and are a topic of continued 
research. 

7. Conclusion. We have described a new algebraic multilevel framework for 
computing dominant and minimal singular triplets. As far as we are aware, this is 
the first algebraic multigrid method that directly tackles the SVD problem, without 
working on A 1 A or the augmented symmetric system. We combine a multiplicative 
phase with an additive phase to obtain a self-learning method that can converge to 
machine accuracy for multiple singular vectors represented collectively using single 
interpolation operators. The self-learning capability of the algorithm makes it appli- 
cable to many types of problems, both for dominant and minimal triplets. We have 
identified a generalized SVD decomposition of a matrix A relative to two SPD ma- 
trices B and C of compatible dimensions as the problem to be solved on the coarse 
levels of our multilevel method, and have stated its existence and uniqueness prop- 
erties and discussed relevant solution methods. Our multiplicative phase follows the 
BAMG framework, as in [27] for SPD cigcnproblems, and our additive phase follows 
a multigrid-Ritz strategy, as in [4] for SPD cigcnproblems. The specialization of our 
combined method to SPD matrices offers a new extension of those existing AMG 
eigensolvers, that allows for highly accurate convergence and is flexible due to its self- 
learning nature. Ongoing work is aimed at improving the parameter-independence 
and robustness of components of the algorithm, and alternative building blocks can 
be considered [TT] [T2J 2] [25] EH] for some of the components in the algorithmic 
framework. Numerical tests using our current implementation showed that conver- 
gence to high accuracy can be obtained in a modest number of V-cycles, and the 
versatility of the approach was illustrated by applying it to problems from different 
domains. 
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